(57) Abstract: A system (10) used with a virtual device (50) inputs or transfers information to a companion device (80), and includes
two optical systems OS1 (20), OS2 (60). In a structured-light embodiment, OS1 (20) emits a fan beam plane (30) of optical energy
parallel to and above the virtual device (50). When a user object (110) penetrates the beam plane of interest, OS2 (60) registers the
event. Triangulation methods can locate the virtual contact, and transfer user-intended information to the companion system (80, 90).
In a non-structured light embodiment, OS1 (20) is preferably a digital camera whose field of view defines the plane of interest, which
is illuminated by an active source of optical energy. Preferably the active source, OS1 (20), and OS2 (60) operate synchronously to
reduce effects of ambient light. A non-structured passive light embodiment is similar except the source of optical energy is ambient
light. A subtraction technique preferably enhances the signal/noise ratio. The companion device (80) may in fact house the present
invention.
QUASI-THREE-DIMENSIONAL METHOD AND APPARATUS TO DETECT
AND LOCALIZE INTERACTION OF USER-OBJECT AND VIRTUAL
TRANSFER DEVICE

RELATION TO PREVIOUSLY FILED APPLICATION 
Priority is claimed from applicants' co-pending U.S. provisional patent
application serial no. 60/287,115 filed on 27 April 2001 entitled "Input
Methods Using Planar Range Sensors", from co-pending U.S. provisional patent
application serial no. 60/272,120 filed on 27 February 2001 entitled
"Vertical Triangulation System for a Virtual Touch-Sensitive Surface", and
from co-pending U.S. provisional patent application serial no. 60/231,184
filed on 7 September 2000 entitled "Application of Image Processing
Techniques for a Virtual Keyboard System". Further, this application is a
continuation-in-part from co-pending U.S. patent application serial no.
09/502,499 filed on 11 February 2000 entitled "Method And Apparatus for
Entering Data Using A Virtual Input Device". Each of said applications is
incorporated herein by reference.

FIELD OF THE INVENTION 
The invention relates generally to sensing proximity of a stylus or user finger
relative to a device to input or transfer commands and/or data to a system, 
and more particularly to such sensing relative to a virtual device used to input 
or transfer commands and/or data and/or other information to a system. 

BACKGROUND OF THE INVENTION

It is often desirable to use virtual input devices to input commands and/or
data and/or transfer other information to electronic systems, for example a
computer system, a musical instrument, even telephones. For example,
although computers can now be implemented in almost pocket-size, inputting
data or commands on a mini-keyboard can be time consuming and error
prone. While many cellular telephones can today handle e-mail
communication, actually inputting messages using the small telephone touch
pad can be difficult.

For example, a PDA has much of the functionality of a computer but suffers
from a tiny or non-existent keyboard. If a system could be used to determine
when a user's fingers or stylus contacted a virtual keyboard, and what fingers
contacted what virtual keys thereon, the output of the system could perhaps
be input to the PDA in lieu of keyboard information. (The terms "finger" or
"fingers", and "stylus" are used interchangeably herein.) In this example a
virtual keyboard might be a piece of paper, perhaps that unfolds to the size of
a keyboard, with keys printed thereon, to guide the user's hands. It is
understood that the virtual keyboard or other input device is simply a work
surface and has no sensors or mechanical or electronic components. The paper
and keys would not actually input information, but the interaction or interface
between the user's fingers and portions of the paper, or if not paper, portions
of a work surface, whereon keys would exist, could be used to input
information to the PDA. A similar virtual device and system might be useful to
input e-mail to a cellular telephone. A virtual piano-type keyboard might be
used to play a real musical instrument. The challenge is how to detect or sense
where the user's fingers or a stylus are relative to the virtual device.

U.S. patent 5,767,848 to Korth (1998) entitled "Method and Device For
Optical Input of Commands or Data" attempts to implement virtual devices
using a two-dimensional TV video camera. Such optical systems rely upon
luminance data and require a stable source of ambient light, but unfortunately
luminance data can confuse an imaging system. For example, a user's finger
in the image foreground may be indistinguishable from regions of the
background. Further, shadows and other image-blocking phenomena resulting
from a user's hands obstructing the virtual device would seem to make
implementing a Korth system somewhat imprecise in operation. Korth would
also require examination of the contour of a user's fingers, finger position
relative to the virtual device, and a determination of finger movement.

U.S. Patent no. to Bamji et al. (2001) entitled "CMOS-Compatible
Three-Dimensional Image Sensor IC", application serial no. 09/406,059, filed
22 September 1999, discloses a sophisticated three-dimensional imaging
system usable with virtual devices to input commands and data to electronic
systems. In that patent, various range finding systems were disclosed, which
systems could be used to determine the interface between a user's fingertip
and a virtual input device, e.g., a keyboard. Imaging was determined in
three dimensions using time-of-flight measurements. A light source emitted
optical






WO 02/21502    PCT/US01/28094



energy towards a target object, e.g., a virtual device, and energy reflected by
portions of the object within the imaging path was detected by an array of
photodiodes. Using various sophisticated techniques, the actual time-of-flight
between emission of the optical energy and its detection by the photodiode
array was determined. This measurement permitted calculating the vector
distance to the point on the target object in three dimensions, e.g., (x,y,z).
The described system examined reflected emitted energy, and could function
without ambient light. If for example the target object were a layout of a
computer keyboard, perhaps a piece of paper with printed keys thereon, the
system could determine which user finger touched what portion of the target,
e.g., which virtual key, in what order. Of course the piece of paper would be
optional and would be used to guide the user's fingers.

Three-dimensional data obtained with the Bamji invention could be
software-processed to localize user fingers as they come in contact with a
touch surface, e.g., a virtual input device. The software could identify finger
contact with a location on the surface as a request to input a keyboard event
to an application executed by an associated electronic device or system (e.g.,
a computer, PDA, cell phone, kiosk device, point of sale device, etc.).

While the Bamji system worked and could be used to input commands and/or
data to a computer system using three-dimensional imaging to analyze the
interface of a user's fingers and a virtual input device, a less complex and
perhaps less sophisticated system is desirable. Like the Bamji system, such
new system should be relatively inexpensive to mass produce and should
consume relatively little operating power such that battery operation is
feasible.

The present invention provides such a system. 

SUMMARY OF THE PRESENT INVENTION

The present invention localizes interaction between a user finger or stylus and
a passive touch surface (e.g., virtual input device), defined above a work
surface, using planar quasi-three-dimensional sensing. Quasi-three-dimensional
sensing implies that determination of an interaction point can be made
essentially in three dimensions, using as a reference a two-dimensional
surface that is arbitrarily oriented in three-dimensional space. Once a touch
has been detected, the invention localizes the touch region to determine
where on a virtual input device the touching occurred, and what data or
command keystroke, corresponding to the localized region that was touched,
is to be generated in response to the touch. Alternatively, the virtual input
device might include a virtual mouse or trackball. In such an embodiment,
the present invention would detect and report coordinates of the point of
contact with the virtual input device, which coordinates would be coupled to
an application, perhaps to move a cursor on a display (in a virtual mouse or
trackball implementation) and/or to lay so-called digital ink for a drawing or
writing application (virtual pen or stylus implementation). In the various
embodiments, triangulation analysis methods preferably are used to determine
where user-object "contact" with the virtual input device occurs.

In a so-called structured-light embodiment, the invention includes a first
optical system (OS1) that generates a plane of optical energy defining a
fan-beam of beam angle φ parallel to and a small stand-off distance ΔY above
the work surface whereon the virtual input device may be defined. In this
embodiment, the plane of interest is the plane of light produced by OS1,
typically a laser or LED light generator. The two parallel planes may typically
be horizontal, but they may be disposed vertically or at any other angle that
may be convenient. The invention further includes a second optical system
(OS2) that is responsive to optical energy of the same wavelength as emitted
by OS1. Preferably OS2 is disposed above OS1 and angled with offset θ,
relative to the fan-beam plane, toward the region where the virtual input
device is defined. OS2 is responsive to energy emitted by OS1, but the
wavelength of the optical energy need not be visible to humans.

The invention may also be implemented using non-structured-light
configurations that may be active or passive. In a passive triangulation
embodiment, OS1 is a camera rather than an active source of optical energy,
and OS2 is a camera responsive to the same optical energy as OS1, and
preferably disposed as described above. In such embodiment, the plane of
interest is the projection plane of a scan line of the OS1 camera. In a
non-structured-light embodiment such as an active triangulation embodiment,
OS1 and OS2 are cameras and the invention further includes an active light
source that emits optical energy having wavelengths to which OS1 and OS2
respond. Optionally in such embodiment, OS1 and OS2 can each include a
shutter mechanism synchronized to output from the active light source, such
that shutters in OS1 and OS2 are open when optical energy is emitted, and are
otherwise closed. An advantage of a non-structured light configuration using
two cameras is that bumps or irregularities in the work surface are better
tolerated. The plane defined by OS1 may be selected by choosing an
appropriate row of OS1 sensing pixel elements to conform to the highest
y-dimension point (e.g., bump) of the work surface.

In the structured-light embodiment, OS2 will not detect optical energy until an
object, e.g., a user finger or stylus, begins to touch the work surface region
whereon the virtual input device is defined. However, as soon as the object
penetrates the plane of optical energy emitted by OS1, the portion of the
finger or stylus intersecting the plane will be illuminated (visibly or invisibly to
a user). OS2 senses the intersection with the plane of interest by detecting
optical energy reflected towards OS2 by the illuminated object region.

Essentially only one plane is of interest to the present invention, as
determined by configuration of OS1, and all other planes definable in
three-dimensional space parallel to the virtual input device can be ignored as
irrelevant. Thus, a planar three-dimensional sensor system senses user
interactions with a virtual input device occurring on the emitted fan-beam
plane, and ignores any interactions on other planes.

In this fashion, the present invention detects that an object has touched the
virtual input device. Having sensed that a relevant touch-intersection is
occurring, the invention then localizes in two dimensions the location of the
touch upon the plane of the virtual device. In the preferred implementation,
localized events can include identifying which virtual keys on a virtual
computer keyboard or musical keyboard are touched by the user. The user may
touch more than one virtual key at a time, for example the "shift" key and
another key. Note too that the time order of the touchings is determined by
the present invention. Thus, if the user touches virtual keys for "shift" and
"t", and then for the letters "h" and then "e", the present invention will
recognize what is being input as "T" then "h" and then "e", or "The". It will
be appreciated that the present invention does not rely upon ambient light,
and thus can be fully operative even absent ambient light, assuming that the
user knows the location of the virtual input device.
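
The time-ordering and simultaneous-key behavior described above can be
illustrated with a short sketch. It is not taken from the specification: the
function name decode_touches and the (t_down, t_up, key) event format are
hypothetical, and it assumes that a key counts as shifted when a "shift" touch
is in progress at the moment the key goes down.

```python
def decode_touches(touches):
    """Turn touch intervals into text, honoring time order and "shift".

    touches: iterable of (t_down, t_up, key) tuples (hypothetical format).
    A key is upper-cased if any "shift" interval covers its t_down.
    """
    touches = list(touches)
    shift_spans = [(down, up) for down, up, key in touches if key == "shift"]
    text = []
    # Sorting the tuples orders events by the time each touch began.
    for down, up, key in sorted(touches):
        if key == "shift":
            continue
        shifted = any(s_down <= down <= s_up for s_down, s_up in shift_spans)
        text.append(key.upper() if shifted else key)
    return "".join(text)


# Events may arrive in any order; the decoder sorts them by touch-down time.
events = [(0.4, 0.5, "h"), (0.0, 0.3, "shift"), (0.6, 0.7, "e"), (0.1, 0.2, "t")]
print(decode_touches(events))
```

With the sample events, "shift" is held while "t" goes down, so the result
reads "The", matching the example in the text.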

Structured-light and/or non-structured light passive triangulation methods may
be used to determine a point of contact (x,z) between a user's hand and the
sense plane. Since the baseline distance B between OS1 and OS2 is known,
a triangle is formed between OS1, OS2 and point (x,z), whose sides are B,
and projection rays R1 and R2 from OS1, OS2 to (x,z). OS1 and OS2 allow
determination of triangle angular distance from a reference plane, as well as
angles α1 and α2 formed by the projection rays, and trigonometry yields
distance z to the surface point (x,z), as well as projection ray lengths.
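
As a rough illustration of the trigonometry just described, the sketch below
solves the triangle with the law of sines. It rests on assumptions the text
does not state: the two angles are taken to be the interior angles between the
baseline and each projection ray, the returned distance is measured
perpendicular to the baseline, and the function name triangulate is
hypothetical.

```python
import math


def triangulate(baseline, alpha1, alpha2):
    """Solve the OS1/OS2/contact-point triangle (angles in radians).

    Assumes alpha1 and alpha2 are interior angles at OS1 and OS2, measured
    between the baseline and the respective projection ray.
    Returns (r1, r2, z): the two ray lengths and the perpendicular distance
    from the baseline to the contact point.
    """
    gamma = math.pi - alpha1 - alpha2  # angle at the contact point
    if baseline <= 0 or alpha1 <= 0 or alpha2 <= 0 or gamma <= 0:
        raise ValueError("inputs do not form a valid triangle")
    # Law of sines: each side over the sine of its opposite angle is constant.
    scale = baseline / math.sin(gamma)
    r1 = scale * math.sin(alpha2)  # ray from OS1, opposite the angle at OS2
    r2 = scale * math.sin(alpha1)  # ray from OS2, opposite the angle at OS1
    z = r1 * math.sin(alpha1)      # height of the triangle above the baseline
    return r1, r2, z


r1, r2, z = triangulate(2.0, math.radians(45), math.radians(45))
print(round(r1, 4), round(r2, 4), round(z, 4))
```

For a baseline of 2 units and two 45 degree angles, the triangle is a right
isosceles triangle, so both rays have length sqrt(2) and the perpendicular
distance is 1.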

A processor unit associated with the present invention executes software to
identify each intersection of a user-controlled object with the virtual input
device and determines therefrom the appropriate user-intended input data
and/or command, preferably using triangulation analysis. The data and/or
commands can then be output by the present invention as input to a device or
system for which the virtual input device is used. If desired the present
invention may be implemented within the companion device or system,
especially for PDAs, cellular telephones, and other small form factor devices or
systems that often lack a large user input device such as a keyboard.

Other features and advantages of the invention will appear from the following
description in which the preferred embodiments have been set forth in detail,
in conjunction with their accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 
FIG. 1A depicts a planar quasi-three-dimensional detection structured-light
system used to detect user input to a virtual input device, according to the
present invention;

FIG. 1B depicts a planar quasi-three-dimensional detection non-structured
active light system used to detect user input to a virtual input device,
according to the present invention;

FIG. 1C depicts a planar quasi-three-dimensional detection non-structured
passive light system used to detect user input to a virtual input device,
according to the present invention;

FIG. 2A depicts geometry associated with location determination using
triangulation, according to the present invention;

FIG. 2B depicts use of a spaced-apart optical emitter and reflector as a first
optical system, according to the present invention;

FIGS. 3A-3E depict design tradeoffs associated with varying orientations of
OS2 sensor, OS2 lens, and detection plane upon effective field of view and
image quality, according to the present invention;

Fig. 4 is a block diagram depicting functions carried out by a processor unit in
the exemplary system of Fig. 1B, according to an embodiment of the present
invention;

Fig. 5A depicts an embodiment wherein the virtual device has five
user-selectable regions and the companion device is a monitor, according to
the present invention;

Fig. 5B depicts an embodiment wherein the virtual device is a computer
keyboard and the companion device is a mobile transceiver, according to the
present invention;

Fig. 5C depicts an embodiment wherein the virtual device is mounted or
projected on a wall and the companion device is a monitor, according to the
present invention;

FIG. 6 depicts planar range sensing, according to the present invention; and

FIG. 7 depicts coordinate distance measurements used in an exemplary
calculation of touch location for use in outputting corresponding information or
data or command, according to the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Fig. 1A depicts a preferred embodiment of a quasi-planar three-dimensional
sensing system 10 comprising, in a structured-light system embodiment, a
first optical system (OS1) 20 that emits a fan-beam plane 30 of optical energy
parallel to a planar work surface 40 upon which there is defined a virtual input
device 50 and/or 50' and/or 50". Preferably the fan-beam defines a fan angle
φ, and is spaced-apart from the work surface by a small stand-off distance
ΔY. Any object (e.g., a user finger or stylus) attempting to touch the work
surface must first contact the fan-beam and will thereby be illuminated (visibly
or not visibly) with emitted optical energy. While fan-beam plane 30 and the
work surface plane 40 are shown horizontally disposed in Fig. 1A, these two
planes may be disposed vertically or indeed at any other angle that may be
desired for a system. Note that, without limitation, work surface 40 could be a
portion of a work desk, a table top, a portion of a vehicle, e.g., a tray in an
airplane, a windshield or dashboard, a wall, a display including a projected
image, or a display such as a CRT, an LCD, etc. As used herein, the term
"plane" will be understood to include a subset of a full plane. For example,
fan-beam plane 30 will be termed a plane, even though it has finite width and
does not extend infinitely in all directions.

By "virtual input device" it is meant that an image of an input device may be
present on work surface 40, perhaps by placing a paper bearing a printed
image, or perhaps system 10 projects a visible image of the input device onto
the work surface, or there literally may be no image whatsoever visible upon
work surface 40. As such, virtual input device 50, 50', 50" requires no
mechanical parts such as working keys, and need not be sensitive to a touch
by a finger or stylus: in short, the virtual input device preferably is passive.

In the example of Fig. 1A, virtual input device 50 is a computer-type keyboard
that may be full sized or scaled up or down from an actual sized keyboard. If
desired the virtual input device may comprise or include a virtual trackball 50'
and/or a virtual touchpad 50". When system 10 is used with a virtual
keyboard input device 50, or virtual trackball 50' or virtual touchpad 50", a
fan angle φ of about 50° to 90° and preferably about 90° will ensure that fan
beam 30 encompasses the entire virtual input device at distances commonly
used. Further, for such a virtual input device, a stand-off distance ΔY of up to
a few mm works well, preferably about 1 mm.

System 10 further includes a second optical system (OS2) 60, typically a
camera with a planar sensor, that is preferably spaced apart from and above
OS1 20, and inclined toward work surface 40 and plane 30 at an angle θ,
about 10° to about 90°, and preferably about 25°. System 10 further includes
an electronic processing system 70 that, among other tasks, supervises OS1
and OS2. System 70 preferably includes at least a central processor unit
(CPU) and associated memory that can include read-only-memory (ROM)
and random access memory (RAM).

In Fig. 1A, system 10 elements OS1 20, OS2 60, and processor unit 70 are
shown disposed on or in a device 80. Device 80 may be a stand-alone
implementation of system 10 or may in fact be a system or device for which
virtual input device 50 is used to input data or commands. In the latter case,
device 80 may, without limitation, be a computer, a PDA (as shown in Fig.
1A), a cellular telephone, a musical instrument, etc. If system or device 80 is
not being controlled by the virtual input device, the device 90 being so
controlled can be coupled electrically to system/device 80 to receive data
and/or commands input from virtual device 50. Where the virtual device is a
trackball (or mouse) 50' or touchpad 50", user interaction with such virtual
device can directly output raw information or data comprising touch
coordinates (x,z) for use by device 80. For example, user interaction with
virtual input device 50' or 50" might reposition a cursor 160 on a display 140,
or otherwise alter an application executed by device 80, or lay down a locus of
so-called digital ink 180 that follows what a user might "write" using a virtual
mouse or trackball 50', or using a stylus 120' and a virtual touchpad 50".
System/device 90 can be electrically coupled to system 80 by a medium 100
that may without limitation include wire(s) or be wireless, or can be a network
including the internet.

In a structured-light embodiment, OS1 20 emits optical energy in fan-beam
30, parallel to the x-z plane 30. OS1 may include a laser line generator or an
LED line generator, although other optical energy sources could be used to
emit plane 30. A line generator OS1 is so called because it emits a plane of
light that when intersected by a second plane illuminates what OS2 would
view as a line on the second plane. For example if a cylindrical object
intersected plane 30, OS2 would see the event as an illuminated portion of an
elliptical arc whose aspect ratio would depend upon distance of OS2 above
plane 30 and surface 40. Thus, excluding ambient light, detection by OS2 of
an elliptical arc on plane 30 denotes a touching event, e.g., that an object
such as 120R has contacted or penetrated plane 30. Although a variety of
optical emitters may be used, a laser diode outputting perhaps 3 mW average
power at a wavelength of between 300 nm and perhaps 1,000 nm could be
used. While ambient light wavelengths (e.g., perhaps 350 nm to 700 nm)
could be used, the effects of ambient light may be minimized without filtering
or shutters if such wavelengths are avoided. Thus, wavelengths of about 600
nm (visible red) up to perhaps 1,000 nm (deep infrared) could be used. A
laser diode outputting 850 nm wavelength optical energy would represent an
economical emitter, although OS2 would preferably include a filter to reduce
the effects of ambient light.

While OS1 preferably is stationary in a structured light embodiment, it is
understood that a fan-beam 30 could be generated by mechanically sweeping
a single emitted line of optical energy to define the fan-beam plane 30. As
shown in Fig. 2B, OS1 may in fact include an optical energy emitter 20-A that
emits a fan beam, and a reflecting mirror 20-B that directs the fan beam 30
substantially parallel to surface 40. For purposes of the present invention, in
a structured light embodiment, optical energy emitted by OS1 20 may be
visible to humans or not visible. OS2 60 preferably includes a camera system
responsive to optical energy of the wavelength emitted by OS1 20. By
"responsive" it is meant that OS2 recognizes energy of the same wavelength
emitted by OS1, and ideally will not recognize or respond to energy of
substantially differing wavelength. For example, OS2 may include a filter
system such that optical energy of wavelength other than that emitted by OS1
is not detected, for example a color filter.

If desired, OS2 could be made responsive substantially solely to optical
energy emitted from OS1 by synchronously switching OS1 and OS2 on and
off at the same time, e.g., under control of unit 70. OS1 and OS2 preferably
would include shutter mechanisms, depicted as elements 22, that would
functionally open and close in synchronized fashion. For example, electronic
processing system 70 could synchronously switch on OS1, OS2, or shutter
mechanisms 22 for a time period t1 with a desired duty cycle, where t1 is
perhaps in the range of about 0.1 ms to about 35 ms, and then switch off
OS1 and OS2. If desired, OS1 could be operated at all times, where plane
30 is permitted to radiate only when shutter 22 in front of OS1 20 is open. In
the various shutter configurations, repetition rate of the synchronous switching
is preferably in the range of 20 Hz to perhaps 300 Hz to promote an adequate
rate of frame data acquisition. To conserve operating power and reduce
computational overhead, a repetition rate of perhaps 30 Hz to 100 Hz
represents an acceptable rate. Of course other devices and methods for
ensuring that OS2 responds substantially only to optical energy emitted by OS1
may also be used. For ease of illustration shutters 22 are depicted as
mechanical elements, but in practice the concept of shutters 22 is understood
to include turning on and off light sources and cameras in any of a variety of
ways.
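
The relationship between shutter on-time, repetition rate, and duty cycle
mentioned above is simple arithmetic, sketched below. The function name
duty_cycle is hypothetical, and the sketch assumes exactly one on-period per
repetition cycle, which the text does not state explicitly.

```python
def duty_cycle(on_time_s, repetition_hz):
    """Fraction of each cycle during which emitter and shutters are on.

    Assumes one on-period of length on_time_s per repetition cycle.
    Raises ValueError if the on-time does not fit within one cycle.
    """
    if on_time_s <= 0 or repetition_hz <= 0:
        raise ValueError("on-time and repetition rate must be positive")
    period_s = 1.0 / repetition_hz
    if on_time_s > period_s:
        raise ValueError("on-time exceeds the repetition period")
    return on_time_s / period_s


# A 1 ms on-time at 30 Hz keeps the source on for about 3% of the time.
print(duty_cycle(0.001, 30.0))
```

Under this assumption, not every pairing of the quoted ranges is feasible: a
35 ms on-time cannot fit in a 300 Hz cycle, whose period is roughly 3.3 ms, so
the longest on-times would pair only with the lower repetition rates.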

If desired, source(s) of optical energy used with the present invention could
be made to carry a so-called signature to better enable such energy to be
discerned from ambient light energy. For example and without limitation,
such sources might be modulated at a fixed frequency such that cameras or
other sensor units used with the present invention can more readily recognize
such energy while ambient light energy would, by virtue of lacking such
signature, be substantially rejected. In short, signature techniques such as
selecting wavelengths for optical energy that differ from ambient light,
techniques that involve synchronized operation of light sources and camera
sensors, and modulating or otherwise tagging light source energy can all
improve the signal/noise ratio of information acquired by the present
invention.
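
One way a fixed-frequency signature could be recovered is synchronous
(lock-in style) demodulation, sketched below. This is an illustrative
technique chosen here, not one named in the specification. The sketch assumes
a single detector sampled well above the modulation frequency over a whole
number of modulation cycles, with steady ambient light; the name
lockin_amplitude is hypothetical.

```python
import math


def lockin_amplitude(samples, sample_rate_hz, mod_freq_hz):
    """Estimate the amplitude of the component at mod_freq_hz in samples.

    Correlates the samples with in-phase and quadrature references. Over a
    whole number of modulation cycles, a constant (ambient) level averages
    out and only energy at the modulation frequency remains.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("no samples")
    i_sum = 0.0
    q_sum = 0.0
    for k, value in enumerate(samples):
        phase = 2.0 * math.pi * mod_freq_hz * k / sample_rate_hz
        i_sum += value * math.cos(phase)
        q_sum += value * math.sin(phase)
    return 2.0 * math.hypot(i_sum, q_sum) / n


fs, f, n = 1000.0, 100.0, 1000  # assumed rates; 100 full modulation cycles
# Strong steady ambient level (5.0) plus a weak modulated reflection (0.2).
lit = [5.0 + 0.2 * math.sin(2.0 * math.pi * f * k / fs + 0.3) for k in range(n)]
dark = [5.0] * n
print(lockin_amplitude(lit, fs, f), lockin_amplitude(dark, fs, f))
```

In this synthetic case the estimate is close to 0.2 when the modulated source
is present and close to zero for ambient light alone, even though the ambient
level is 25 times larger than the modulated signal.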

Note that there is no requirement that work surface 40 be reflective or
non-reflective with respect to the wavelength emitted by OS1 since the fan-beam
or other emission of optical energy does not reach the surface per se. Note
too that preferably the virtual input device is entirely passive. Since device 50
is passive, it can be scaled to be smaller than a full-sized device, if
necessary. Further, the cost of a passive virtual input device can be nil,
especially if the "device" is simply a piece of paper bearing a printed graphic
image of an actual input device.

In Fig. 1A, assume initially that the user of system 10 is not in close proximity
to virtual input device 50. In a structured-light embodiment, although OS1
may emit optical energy fan-beam plane 30, OS2 detects nothing because no
object intersects plane 30. Assume now that a portion 110 of a finger of a
user's left or right hand 120L, 120R moves downward to touch a portion of
the area of work surface 40 whereon the virtual input device 50 is defined.
Alternatively, portion 110' of a user-controlled stylus 120' could be moved
downward to touch a relevant portion of work surface 40. Within the context
of the present invention, a touch is interpreted by software associated with
the invention as a request to send a keyboard event to an application running
on a companion device or system 80 or 90, e.g., notebook, PDA, cell phone,
kiosk device, point of sale device, etc.

In Fig. 1A, as the user's finger moves downward and begins to intersect
optical energy plane 30, a portion of the finger tip facing OS1 will now reflect
optical energy 130. At least some reflected optical energy 130 will be
detected by OS2, since the wavelength of the reflected energy is the same as
the energy emitted by OS1, and OS2 is responsive to energy of such
wavelength. Thus, planar quasi-three-dimensional sense system 10 detects
optical energy reflected by the interaction of a user controlled object (e.g., a
finger, a stylus, etc.) occurring at a plane of interest defined by fan-beam
plane 30. Any interaction(s) that may occur on any other plane are deemed
not relevant and may be ignored by the present invention.

Thus, until an object such as a portion of a user's hand or perhaps of a stylus
intersects the optical energy plane 30 emitted by OS1 20, there will be no
reflected optical energy 130 for OS2 60 to detect. Under such conditions,
system 10 knows that no user input is being made. However as soon as the
optical energy plane is penetrated, the intersection of the penetrating object
(e.g., fingertip, stylus tip, etc.) is detected by OS2 60, and the location (x,z) of
the penetration can be determined by processor unit 70 associated with
system 10. In Fig. 1A, if the user's left forefinger is touching the portion of
virtual input device 50 defined as co-ordinate (x7,z3), then software
associated with the invention can determine that the letter "t" has been
"pressed". Since no "shift key" is also being pressed, the pressed letter would
be understood to be lower case "t".
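
A minimal sketch of how a localized touch coordinate (x,z) might be mapped to
a virtual key follows. Every specific here is an assumption made for
illustration: the uniform 19 mm key pitch, the four-row layout, the origin at
one corner of the key area, and the name key_at are hypothetical and do not
come from the specification.

```python
KEY_PITCH_MM = 19.0  # assumed uniform key spacing
LAYOUT = [           # hypothetical layout; row 0 is the first row along z
    "1234567890",
    "qwertyuiop",
    "asdfghjkl;",
    "zxcvbnm,./",
]


def key_at(x_mm, z_mm, pitch=KEY_PITCH_MM, layout=LAYOUT):
    """Return the key under touch point (x_mm, z_mm), or None if outside.

    Assumes the origin is at one corner of the key area, with x running
    along a row of keys and z running across rows.
    """
    if x_mm < 0 or z_mm < 0:
        return None
    col = int(x_mm // pitch)
    row = int(z_mm // pitch)
    if row >= len(layout) or col >= len(layout[row]):
        return None
    return layout[row][col]


# A touch 81 mm along x and 21 mm along z lands in column 4 of row 1.
print(key_at(81.0, 21.0))
```

With this hypothetical layout the sample touch resolves to "t", and touches
outside the key area return None so that they can be ignored.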

In the embodiment shown, system 10 can generate and input to system 80 or
90 keystrokes representing data and/or commands that a user would have
entered on an actual keyboard. Such input to system 80 or 90 can be used to
show information 140 on display 150, as the information is entered by the
user on virtual input device 50. If desired, an enlarged cursor region 160
could be implemented to provide additional visual input to aid the user who is
inputting information. If desired, processor unit 70 could cause system 80
and/or 90 to emit audible feedback to help the user, e.g., electronic keyclick
sounds 170 corresponding with the "pressing" of a virtual key on virtual input
device 50. It is understood that if system 80 or 90 were a musical instrument
rather than a computer or PDA or cellular telephone, musical sounds 170
would be emitted, and virtual input device 50 could instead have the
configuration similar to a piano keyboard or keyboards associated with
synthetic music generators.

Fig. 1B depicts a non-structured active light system 10, in which a camera 20'
in a first optical system OS1 defines a plane of interest 30' that in essence
replaces plane 30 defined by optical emitter OS1 in the embodiment of Fig.
1A. Camera 20' OS1 preferably is similar to camera 60 OS2, which may be
similar to camera 60 OS2 in the embodiment of Fig. 1A. For example, OS1
20' may have a sensor array that comprises at least one line and preferably
several lines of pixel detector elements. The embodiment of Fig. 1B is active
in that one or more light sources 190, disposed intermediate OS1 20' and
OS2 60, generate optical energy of a wavelength that is detectable by camera
OS1 20' and by camera OS2 60. To reduce the effects of ambient light upon
detection by cameras OS1 and OS2, preferably each camera and each
optical energy emitter 190 operates in cooperation with a shutter mechanism,
preferably synchronized, e.g., by unit 70. Thus, during the times that shutters
22 permit optical energy from emitter 190 to radiate towards the virtual input
device 50, 50', 50", similar shutters 22 will permit cameras OS1 and OS2 to
detect optical energy. The interaction of user-object, e.g., 120L with plane 30'
is detected by OS1 and by OS2. The location of the point of intersection is
then calculated, e.g., using triangulation methods described later herein.

In Fig. 1B, a bump or irregularity in the plane of work surface 40 is shown
near the point of contact 110 with the user-object 120L. An advantage of the
presence of second camera OS1 20' is that the plane of interest 30' may be
selected, perhaps by unit 70, to lie just above the highest irregular portion of
work surface 40. If irregularities were present in work surface 40 in the
embodiment of Fig. 1A, it would be necessary to somehow reposition the
laser plane 30 relative to the work surface. But in Fig. 1B, the effect of such
repositioning is attained electronically simply by selecting an appropriate line
of pixels from the detector array with OS1 20'.

Note that the configuration of Fig. 1B lends itself to various methods to
improve the signal/noise ratio. For example, shutters 22 can permit cameras
OS1 and OS2 to gather image data during a time that emitters 190 are turned
off, e.g., by control unit 70. Any image data then acquired by OS1 and/or
OS2 will represent background noise resulting from ambient light. (Again it is
understood that to minimize effects of ambient light, emitters 190 and cameras
OS1, OS2 preferably operate at a wavelength regime removed from that
of ambient light.) Having acquired what might be termed a background noise
signal, cameras OS1 and OS2 can now be operated normally and in synchronism
with emitter(s) 190. Image data acquired by cameras OS1 and OS2 in
synchronism with emitter(s) 190 will include actual data, e.g., user-object
interface with plane 30', plus any (undesired) effects due to ambient light.
Processor unit 70 (or another unit) can then dynamically subtract the background
noise signal from the actual data plus noise signal, to arrive at an
actual data signal, thus enhancing the signal/noise ratio.
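
By way of illustration only, the background subtraction just described might be
sketched as follows. The sketch assumes that each frame is a grid of
non-negative intensity values of equal size; the name subtract_background and
the clamping of negative differences to zero are assumptions made for the
example and are not taken from the present description.

```python
def subtract_background(lit_frame, dark_frame):
    """Subtract an emitters-off frame from an emitters-on frame, pixel by pixel.

    Both frames are lists of rows of non-negative intensities (an assumed
    representation). Negative differences are clamped to zero.
    """
    if len(lit_frame) != len(dark_frame):
        raise ValueError("frames must have the same number of rows")
    result = []
    for lit_row, dark_row in zip(lit_frame, dark_frame):
        if len(lit_row) != len(dark_row):
            raise ValueError("rows must have the same number of pixels")
        # What remains after subtraction is attributed to the emitter light.
        result.append([max(lit - dark, 0) for lit, dark in zip(lit_row, dark_row)])
    return result
```

In such a sketch the emitters-off frame would be refreshed periodically, since
ambient light may change over time.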

Fig. 1C depicts a non-structured passive embodiment of the present invention.
System 10 in Fig. 1C is passive in that whatever source 195 of ambient
light is present provides optical energy used during imaging. Similar to
system 10 in Fig. 1B, OS1 is a camera 20' that defines a plane of interest 30',
and OS2 is a camera 60. Typically plane 30' will be defined a distance ΔY
above work surface 40, typically a distance of a few mm. User-object interaction
with plane 30' is detected by OS1 and OS2, using optical energy from
ambient light source 195. Triangulation methods may then be used to
localize the point of interaction or intersection with plane 30', as described
elsewhere herein.

Fig. 2A depicts the geometry with which location (x,z) of the intercept point
between a user's finger or object 120R and plane 30 may be determined
using triangulation. Fig. 2A and Fig. 2B may be used to describe analysis of
the various embodiments shown in Figs. 1A-1C.

As used herein, triangulation helps determine the shape of surfaces in a field
of view of interest by geometric analysis of triangles formed by the projection
rays, e.g., R1, R2 of two optical systems, e.g., OS1 20, OS2 60. A baseline B
represents the known length of the line that connects the centers of projection
of the two optical systems, OS1, OS2. For a point (x,z) on a visible surface in
the field of view of interest, a triangle may be defined by the location of the
point and by locations of OS1 and OS2. The three sides of the triangle are
B, R1, and R2. OS1 and OS2 can determine the angular distance of the
triangle from a reference plane, as well as the angles α1 and α2 formed by the
projection rays that connect the surface point with the centers of projection of
the two optical systems. Angles α1 and α2 and baseline B completely determine
the shape of the triangle. Simple trigonometry can be used to yield the
distance to the surface point (x,z), as well as length of projection ray R1
and/or R2.
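
The trigonometry referred to above may be illustrated by a minimal sketch using
the law of sines. The coordinate frame is an assumption made for the example:
OS1 is placed at the origin, OS2 at distance B along the baseline, angles α1
and α2 are taken as the interior angles between the baseline and rays R1 and
R2, and z is measured perpendicular to the baseline. The function name
triangulate is likewise hypothetical.

```python
import math


def triangulate(baseline, alpha1, alpha2):
    """Locate a surface point from baseline length and the two base angles.

    Assumed frame: OS1 at the origin, OS2 at (baseline, 0). alpha1 and alpha2
    are the interior angles, in radians, between the baseline and rays R1, R2.
    Returns (x, z, r1, r2).
    """
    apex = math.pi - alpha1 - alpha2  # angle of the triangle at the surface point
    if baseline <= 0 or alpha1 <= 0 or alpha2 <= 0 or apex <= 0:
        raise ValueError("baseline and angles must describe a valid triangle")
    # Law of sines: each side is proportional to the sine of the opposite angle.
    r1 = baseline * math.sin(alpha2) / math.sin(apex)
    r2 = baseline * math.sin(alpha1) / math.sin(apex)
    return r1 * math.cos(alpha1), r1 * math.sin(alpha1), r1, r2
```

For example, with a baseline of 2 units and both angles equal to 45 degrees,
the point lies 1 unit along the baseline and 1 unit away from it.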

It is not required that OS1 20 be implemented as a single unit. For example,
Fig. 2B depicts a structured-light embodiment in which the first optical system
is bifurcate: one portion OS1-A 20-A is a light emitter disposed distance B
from OS2 and from the second portion OS1-B 20-B, a light reflecting device
such as a mirror. An incoming fan beam generated by OS1-A is deflected by
mirror 20-B to form the plane 30. In the orientation of Fig. 2B, mirror 20-B is
inclined about 45° relative to the horizontal plane, and deflection is from a
substantially vertical plane to a substantially horizontal plane. In Fig. 2B and
indeed in a passive light embodiment, OS2 60 will be a camera aimed at
angle φ generally toward the field of view of interest, namely where a user's
finger or stylus will be to "use" a virtual input device disposed beneath fan
plane 30.

Triangulation according to the present invention preferably uses a standard
camera with a planar sensor as OS2 60. The nature of OS1 20 distinguishes
between two rather broad classes of triangulation. In a structured-light
triangulation, OS1 20 is typically a laser or the like whose beam may be
shaped as a single line that is moved to project a moving point onto a surface.
Alternatively the laser beam may be planar and moved to project a
planar curve. As noted, another class of triangulation system may be termed
passive triangulation, in which a camera is used as OS1 20. Structured-light
systems tend to be more complex to build and consume more operating
power, due to the need to project a plane of light. Passive systems are less
expensive, and consume less power. However, a passive system must solve
the so-called correspondence problem, e.g., to determine which pairs of
points in the two images are projections of the same point in the real world.
As will be described, passive non-structured-light triangulation embodiments
may be used, according to the present invention.

Whether system 10 is implemented as a structured-light system in which OS1
actively emits light and OS2 is a camera, or as a passive system in which
OS1 and OS2 are both cameras, information from OS2 and OS1 will be
coupled to a processing unit, e.g., 70, that can determine what events are
occurring. In either embodiment, when an object such as 120R intersects the
projection plane 30 associated with OS1 20, the intersection is detectable. In
a structured-light embodiment in which OS1 emits optical energy, the intersection
is noted by optical energy reflected from the intersected object 120R
and detected by OS2, typically a camera. In a passive light embodiment, the
intersection is seen by OS1, a camera, and also by OS2, a camera. In each
embodiment, the intersection with plane 30 is detected as though the region
of surface 40 underlying the (x,z) plane intersection were touched by object
120R. System 10 preferably includes a computing system 70 that receives
data from OS1, OS2 and uses geometry to determine the plane intersection
position (x,z) from reflected image coordinates in a structured-light embodiment,
or from camera image coordinates in a passive system. As such, the
dual tasks of detecting initial and continuing contact and penetration of plane
30 (e.g., touch events), and determining intersection coordinate positions on
the plane may be thus accomplished.

To summarize thus far, touch events are detected and declared when OS1
recognizes the intersection of plane 30 with an intruding object such as 120R.
In a two-camera system, a correspondence is established between points in
the perceived image from OS1 and those in OS2. Thereafter, OS2
camera coordinates are transformed into touch-area (x-axis, z-axis) coordinates
to locate the (x,z) coordinate position of the event within the area of
interest in plane 30. Preferably such transformations are carried out by
processor unit 70, which executes algorithms to compute intersect positions
in plane 30 from image coordinates of points visible to OS2. Further, a
passive light system must distinguish intruding objects from background in
images from OS1 and OS2. Where system 10 is a passive light system,
correspondence needs to be established between the images from camera
OS1 and from camera OS2. Where system 10 is a structured-light system, it
is desired to minimize interference from ambient light.

Consider now computation of the (X,Z) intersection or tip position on plane
30. In perspective projection, a plane in the world and its image are related
by a transformation called a homography. Let a point (X,Z) on such plane be
represented by the column vector P = (X, Z, 1)^T, where the superscript T
denotes transposition. Similarly, let the corresponding image point be
represented by p = (x, z, 1)^T.

A homography then is a linear transformation P = Hp, where H is a 3x3
matrix.

This homography matrix may be found using a calibration procedure. Since
the sensor rests on the surface, sensor position relative to the surface is
constant, and the calibration procedure need be executed only once. For
calibration, a grid of known pitch is placed on the flat surface on which the
sensor is resting. The coordinates p_i of the image points corresponding to the
grid vertices P_i are measured in the image. A direct linear transform (DLT)
algorithm can be used to determine the homography matrix H. Such DLT
transform is known in the art; see for example Richard Hartley and Andrew
Zisserman, Multiple View Geometry in Computer Vision, Cambridge University
Press, Cambridge, UK, 2000.

Once H is known, the surface point P corresponding to a point p in the image
is immediately computed by the matrix-vector multiplication above. Preferably
such computations are executed by system 70.
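
The matrix-vector multiplication may be illustrated by a short sketch. It
assumes the usual homogeneous-coordinate convention, under which the product
Hp is divided by its third component to recover (X, Z); that normalization
step, the name apply_homography, and the sample matrices in the usage note are
assumptions made for the example. The sketch does not perform the DLT
calibration itself and takes H as already known.

```python
def apply_homography(h, x, z):
    """Map image point (x, z) to surface point (X, Z) using a 3x3 matrix h.

    h is a list of three rows of three numbers. The result is normalized by
    the third homogeneous component (an assumed convention).
    """
    p = (x, z, 1.0)
    # Matrix-vector product H * p, one output component per matrix row.
    big_x, big_z, w = (sum(h[r][c] * p[c] for c in range(3)) for r in range(3))
    if w == 0:
        raise ValueError("image point maps to infinity")
    return big_x / w, big_z / w
```

With a hypothetical H of [[2, 0, 1], [0, 2, 3], [0, 0, 1]], which is a scale
of 2 and a shift of (1, 3), image point (4, 5) maps to surface point (9, 13).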

Image correspondence for passive light embodiments will now be described.
Cameras OS1 20 and OS2 60 see the same plane in space. As a consequence,
mapping between the line-scan camera image from OS1 and the
camera image from OS2 will itself be a homography. This is similar to
mapping between the OS2 camera image and the plane 30 touch surface
described above with respect to computation of the tip intercept position.
Thus a similar procedure can be used to compute this mapping.

Note that since line-scan camera OS1 20 essentially sees or grazes the touch
surface collapsed to a single line, homography between the two images is
degenerate. For each OS2 camera point there is one OS1 line-scan image
point, but for each OS1 line-scan image point there is an entire line of OS2
camera points. Because of this degeneracy, the above-described DLT
algorithm will be (trivially) modified to yield a point-to-line correspondence.

By definition, a passive light embodiment of the present invention has no
control over ambient lighting, and it can be challenging to distinguish intruding
intersecting objects or tips from the general background. In short, the
question is how to tell whether a particular image pixel in an OS1 image or
OS2 image represents the image of a point on an object such as 120R, or is
a point in the general background. An algorithm executable by system 70 will
now be described.

Initially, assume one or more background images I_1, ..., I_n with only the touch
surface portion of plane 30 in view. Assume that cameras OS1 and OS2 can
respond to color, and let R_bi(x, z), G_bi(x, z), B_bi(x, z) be the red, green, and
blue components of the background image intensity I_i at pixel position (x, z).
Let s_b(x, z) be a summary of R_bi(x, z), G_bi(x, z), B_bi(x, z) over all images. For
instance, s_b(x, z) can be a three-vector with the averages, medians, or other
statistics of R_bi(x, z), G_bi(x, z), B_bi(x, z) at pixel position (x, z) over all
background images I_1, ..., I_n, possibly normalized to de-emphasize variations in
image brightness.

Next, collect a similar summary s_t for tip pixels over a new sequence of
images J_1, ..., J_m. This second summary is a single vector, rather than an
image of vectors as for s_b(x, z). In other words, s_t does not depend on the
pixel position (x, z). This new summary can be computed, for instance, by
asking a user to place finger tips or stylus in the sensitive area of the surface,
and recording values only at pixel positions (x, z) whose color is very different
from the background summary s_b(x, z) at (x, z), and computing statistics over
all values of j, x, z.

Then, given a new image with color components c(x, z) = (R(x, z), G(x, z),
B(x, z)), a particular pixel at (x, z) is attributed to either tip or background by a
suitable discrimination rule. For instance, a distance d(c_1, c_2) can be defined
between three-vectors (Euclidean distance is one example), and pixels are
assigned based on the following exemplary rule:

Background if d(c(x,z), s_b(x,z)) ≪ d(c(x,z), s_t).

Tip if d(c(x,z), s_b(x,z)) ≫ d(c(x,z), s_t).

Unknown otherwise.
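
The exemplary rule above may be sketched as follows, using Euclidean distance.
The rule does not state how much smaller one distance must be than the other;
the sketch assumes a fixed factor, here called ratio with a default of 4,
which is a hypothetical value chosen only for illustration, as is the name
classify_pixel.

```python
import math


def classify_pixel(c, s_b, s_t, ratio=4.0):
    """Attribute a pixel color c to "background", "tip", or "unknown".

    c, s_b and s_t are (R, G, B) three-vectors: the pixel color, the background
    summary at that pixel, and the tip summary. One distance counts as much
    smaller than the other when it is smaller by the assumed factor `ratio`.
    """
    d_background = math.dist(c, s_b)
    d_tip = math.dist(c, s_t)
    if d_background * ratio < d_tip:
        return "background"
    if d_tip * ratio < d_background:
        return "tip"
    return "unknown"
```

A pixel roughly midway in color between the two summaries is left as unknown
rather than forced into either class.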

Techniques for reducing ambient light interference, especially for a
structured-light triangulation embodiment, will now be described. In such
embodiment, OS2 needs to distinguish between ambient light and light
produced by the line generator and reflected back by an intruding object.

Using a first method, OS1 emits energy in a region of the light spectrum
where ambient light has little power, for instance, in the near infrared. An
infrared filter on camera OS2 can ensure that the light detected by the OS2
sensor is primarily reflected from the object (e.g., 120R) into the lens of
camera OS2.

In a second method, OS1 operates in the visible part of the spectrum, but is
substantially brighter than ambient light. Although this can be achieved in
principle with any color of the light source, for indoor applications it may be
useful to use a blue-green light source for OS1 (500 nm to 550 nm) because
standard fluorescent lights have relatively low emission in this band. Preferably
OS2 will include a matched filter to ensure that response to other
wavelengths is substantially attenuated.

A third method to reduce effects of ambient light uses a standard visible laser
source for OS1, and a color camera sensor for OS2. This method uses the
same background subtraction algorithm described above. Let the following
combination be defined, using the same terminology as above:

C(x, z) = min { d(c(x,z), s_b(x,z)), d(c(x,z), s_t) }.

This combination will be exactly zero when c(x,z) is equal to the representative
object tip summary s_t (since d(s_t, s_t) = 0) and for the background image
s_b(x, z) (since d(s_b(x, z), s_b(x, z)) = 0), and close to zero for other object tip
image patches and for visible parts of the background. In other words, object
tips and background will be hardly visible in the image C(x,z). By comparison,
at positions where the projection plane 30 from laser emitter OS1 intersects
object tips 120R, the term d(c(x,z), s_t) will be significantly non-zero, which in
turn yields a substantially non-zero value for C(x,z). This methodology
achieves the desired goal of identifying essentially only the object tip pixels
illuminated by laser (or other emitter) OS1. This method can be varied to use
light emitters of different colors, to use other distance definitions for the
distance d, and to use different summaries s_b(x, z) and s_t.
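
A minimal sketch of the combination C(x, z) follows, again using Euclidean
distance. The representation of an image as rows of (R, G, B) tuples, the name
combination_image, and the sample colors in the test are assumptions made for
the example.

```python
import math


def combination_image(frame, background_summary, tip_summary):
    """Compute C = min(d(c, s_b), d(c, s_t)) at every pixel.

    frame and background_summary are equally sized lists of rows of (R, G, B)
    tuples; tip_summary is a single (R, G, B) tuple. Pixels matching either
    summary yield values near zero, while laser-lit pixels stand out.
    """
    result = []
    for color_row, summary_row in zip(frame, background_summary):
        result.append([
            min(math.dist(c, s_b), math.dist(c, tip_summary))
            for c, s_b in zip(color_row, summary_row)
        ])
    return result
```

Thresholding the resulting image would then retain essentially only the
pixels whose color matches neither the background nor an unlit tip.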

In Fig. 1A, if device 80 is a compact system such as a PDA or cell telephone,
it becomes especially desirable to reduce the size needed to implement the
present invention. A smaller overall form factor can result if OS2 is inclined at
some angle θ, as shown in Figs. 1A-1C, 2A, 2B, with respect to plane 30 or
surface 40. But as angle θ decreases, camera OS2 sees plane 30 from a
shallower angle. For a fixed size for the sensitive area of plane 30, i.e., the
surface rectangle that is to be "touched" by a user object to manipulate an
underlying virtual input device, as distance B and angle θ decrease, the
effective area subtended by the field of view decreases. The result is to
decrease effective OS2 resolution and thus to decrease accuracy of z-depth
measurements as shown in Fig. 3A, where L denotes a camera lens associated
with OS2, whose plane of pixel detectors is shown as a straight line
labeled OS2.

As noted in Fig. 3A, moving OS2 closer to plane 30 results in a shallower
viewpoint and in a smaller, less accurately perceived, camera image. These
adverse side effects may be diminished as shown in Fig. 3B by tilting the
plane of pixel detectors in camera OS2, indeed tilting almost parallel to plane
30. With the tilted configuration of Fig. 3B, note that a substantially greater
number of image scan lines intersect the cone of rays from the sensitive area
on plane 30, which increases depth resolution accordingly. Compare, for
example, the relatively small distance Dx in Fig. 3A with the larger distance
Dx' in Fig. 3B, representing the larger number of image scan lines now in use.
Further, as the OS2 camera sensor plane becomes more parallel to the plane
of the touch surface or to plane 30, less distortion of the touch surface image
results. This implies that parallel lines on the touch surface (or on plane 30)
will remain parallel in the OS2 camera image. An advantage is the simplification
of the homography H to an affine transformation (a shift and a scale).
Further, image resolution is rendered more uniform over the entire sensitive
area within the field of view of interest.

Consider now the configuration of Fig. 3C. It is apparent that different points
on the touch sensitive area of interest on plane 30 are at different distances
from lens L of camera OS2. This means that one cannot focus the entire
sensitive area of interest precisely if lens L is positioned as shown in Fig. 3A
or in Fig. 3B. While closing the camera iris could increase the depth of field,
resultant images would become dimmer, and image signal-to-noise ratio
would be degraded.

Accordingly the configuration of Fig. 3C may be employed in which lens L is
repositioned relative to Fig. 3B. In this configuration, touch surface 30, the
camera OS2 sensor, and camera lens L are said to satisfy the so-called
Scheimpflug condition, in which their respective planes intersect along a
common line, a line that is at infinity in Fig. 3C. Further details as to the
Scheimpflug condition may be found in The Optical Society of America,
Handbook of Optics, Michael Bass, Editor in Chief, McGraw-Hill, Inc., 1995. In
Fig. 3C, when the relevant optical system satisfies this condition, all points on
touch surface 30 will be in focus. Thus, by using an appropriately tilted sensor
OS2 and an appropriately positioned lens L that satisfy the Scheimpflug condition,
the image seen by OS2 of points of interest on surface plane 30 will be
in focus, and will exhibit high resolution with little distortion. But meeting the
Scheimpflug condition can result in loss of image brightness because the
angle that the lens subtends when viewed from the center of the sensitive
area on plane 30 is reduced with respect to the configuration of Fig. 3B. As a
consequence, it may be preferable in some applications to reach a compromise
between sharpness of focus and image brightness, by placing the OS2
camera lens in an orientation intermediate between those of Fig. 3B and Fig.
3C. Fig. 3D depicts one such intermediate configuration, in which lens L is
purposely tilted slightly away from a Scheimpflug-satisfying orientation with
respect to planes of OS2 and 30.

Such intermediate orientations do not satisfy the Scheimpflug condition, but
depart from it by a lesser degree, and therefore still exhibit better focusing
than a configuration whose lens axis points directly towards the center of the
sensitive area of plane 30. Fig. 3E depicts another alternative intermediate
configuration, one in which the Scheimpflug condition is exactly satisfied, but
the camera sensor OS2 is tilted away from horizontal. The configuration of
Fig. 3E can achieve exact focus but with somewhat lower image resolution and
more distortion than the configuration of Fig. 3C.

Fig. 4 is a block diagram depicting operative portions of processor unit 70
within system 10, which processor unit preferably carries out the various
triangulation and other calculations described herein to sense and identify
(x,z) intercepts with the plane of interest 30. At the left portion of Fig. 4,
information from OS1 20 and OS2 60 is input respectively to pixel maps 200-1,
200-2. In Fig. 4, OS1 and OS2 inputs refer to streams of frames of
digitized images that are generated by optical system 1 (20) and optical system 2
(60) in a planar range sensor system 10, according to the present invention.
In a preferred embodiment, each optical system generates at least about 30
frames per second (fps). Higher frame rates are desirable in that at 30 fps,
the tip of the user's finger or stylus can move several pixels while "typing" on
the virtual input device between two frames. Pixel map modules 200-1, 200-2
construct digital frames from OS1 and OS2 in memory associated with
computational unit 70. Synchronizer module 210 ensures that the two optical
systems produce frames of digitized images at approximately the same time.
If desired, a double-buffering system may be implemented to permit construction
of one frame while the previous frame (in time) is being processed by the
other modules. Touch detection module 220 detects a touch (e.g., intersection
of a user finger or stylus with the optical plane sensed by OS1) when the
outline of a fingertip or stylus appears in a selected row of the frame. When a
touch is detected, tip detection module 230 records the outline of the corresponding
fingertip into the appropriate pixel map, 200-1 or 200-2. In Fig. 4, in
a structured-light embodiment where OS1 is a light beam generator, no pixel
map is produced, and touch detection will use input from OS2 rather than
from OS1.
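
The touch test performed on a selected row may be illustrated by a simple
sketch. How the outline of a fingertip is recognized in the row is not
specified above; the sketch assumes, purely for illustration, that a tip
appears as a run of adjacent pixels brighter than a threshold. The names
find_touch_spans, threshold, and min_width are hypothetical.

```python
def find_touch_spans(frame, row_index, threshold, min_width=2):
    """Return (first_col, last_col) spans of bright pixels in one frame row.

    frame is a list of rows of intensities. A span is reported only when at
    least min_width adjacent pixels exceed threshold (assumed criteria).
    """
    row = frame[row_index]
    spans = []
    start = None
    for col, value in enumerate(row):
        if value > threshold:
            if start is None:
                start = col  # a new bright run begins here
        elif start is not None:
            if col - start >= min_width:
                spans.append((start, col - 1))
            start = None
    # Close a run that extends to the end of the row.
    if start is not None and len(row) - start >= min_width:
        spans.append((start, len(row) - 1))
    return spans
```

An empty result would indicate that no touch is present in the selected row
for that frame; isolated bright pixels narrower than min_width are ignored.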

Touch position module 240 uses tip pixel coordinates from tip detection
module 230 at the time a touch is reported from touch detection module 220
to find the (x,z) coordinates of the touch on the touch surface. As noted, a
touch is tantamount to penetration of plane 30 associated with an optical
emitter OS1 in a structured-light embodiment, or in a passive light embodiment,
associated with a plane of view of a camera OS1. Mathematical
methods to convert the pixel coordinates to the X-Z touch position are described
elsewhere herein.

Key identification module 260 uses the X-Z position of a touch and maps the
position to a key identification using a keyboard layout table 250 preferably
stored in memory associated with computation unit 70. Keyboard layout table
250 typically defines the top, bottom, left, and right coordinates of each key
relative to a zero origin. As such, a function of key identification module 260
is to perform a search of table 250 and determine which key contains the (x,z)
coordinates of the touch point. When the touched (virtual) key is identified,
translation module 270 maps the key to a predetermined KEYCODE value.
The KEYCODE value is output or passed to an application, executing on the
companion device or system 80, that is waiting to receive a notification of a
keystroke event. The application under execution interprets the keystroke
event and assigns a meaning to it. For instance, a text input application uses
the value to determine what symbol was typed. An electronic piano application
determines what musical note was pressed and plays that note, etc.



Alternatively, as shown in Fig. 4, the X-Z touch coordinates can be passed
directly to application 280. Application 280 could use the coordinate data to
control the position of a cursor on a display in a virtual mouse or virtual
trackball embodiment, or to control a source of digital ink whose locus is
shown on a display for a drawing or hand-writing type application in a virtual
pen or virtual stylus embodiment.

Fig. 5A is a simplified view of system 10 in which virtual device 50 is now a
control with five regions, and in which the companion device 80, 90 includes a
monitor. In this embodiment, companion device 80 or 90 is shown with a
display 150 that may include icons 140, one of which is surrounded by a
cursor 310 that a user can move using virtual device 50', here a virtual
trackball or mouse. For example, within virtual device 50', if a portion of the
user's hand 120R (or stylus) presses virtual region 300-1, the displayed
cursor 310 on companion device 80, 90 will be commanded to move to the
left. If virtual region 300-2 is pressed, the cursor should move upward. If
virtual region 300-3 is pressed, the cursor should move to the right, e.g., to
"select" the icon of a loaf of bread, and if virtual region 300-4 is pressed, the
cursor should move towards the bottom of the display on device 80, 90. If the
user presses the fifth region 300-5, a "thumbs up" region, companion device
80, 90 knows that the user-selection is now complete. In Fig. 5A, if the user
now presses region 300-5, the "hotdog" icon is selected. If device 80, 90
were a kiosk in a supermarket, for example, selecting the "hotdog" icon might
bring up a display showing where in the market hotdogs are to be found, or
the price of various brands of hot dogs being sold, or device 80, 90 might
even dispense hotdogs. If device 80, 90 were used in a transportation
setting, the icons (or words) might be various destinations, and device 80 or
90 could indicate routes, schedules, and fares to the destinations, and could
even dispense tickets for use on a bus, a subway, an airline, a boat, etc. A
user could, for example, press two regions of input device 50' representing
trip originating point and trip destination point, whereupon system 10 could
cause appropriate transportation vehicles, schedules, fares, etc.
to be displayed and, if desired, printed out. It will be appreciated that information
generated by system 10 may simply be raw (x,z) coordinates that a
software application executed by a companion device may use to reposition a
cursor or other information on a display.

It is understood in Fig. 5A that virtual device 50' is passive; its outline may be
printed or painted onto an underlying work surface, or perhaps its outline can
be projected by system 10. The various regions of interest in virtual device
50 may be identified in terms of coordinates relative to the x-z plane. Consider
the information in Table 1, below, which corresponds to information in
keyboard layout 250 in Fig. 4:

TABLE 1

REGION    TOP    BOTTOM    LEFT    RIGHT
U          -2      -1       -1       1
B           1       2       -1       1
R          -1       1        1       2
L          -1       1       -2      -1
           -1       1       -1       1



When the user's finger (or stylus) touches a region of virtual input device 50,
touch position module 240 (see Fig. 4) determines the (x,z) coordinates of the
touch point 110. In Fig. 5A, touch point 110 is within "B" region 300-4. Key
identification module 260 uses the keyboard layout 250 information, in this
example as shown in Table 1, to determine where in the relevant (x,z) plane
the touch point coordinates occur. By way of example, assume touch coordinates
(x,z) are (1.5, 0.5). A search routine preferably stored in memory
associated with unit 70 (see Fig. 1A) and executed by unit 70 determines that
1 < x < 2, and -1 < z < 1. Searching information in Table 1, the key identification
module will determine that touch point 110 falls within entry B. In this
example, companion device 80 or 90 receives data from system 10 advising
that region B has been touched. Processor unit 70 in system 10 can cause
the companion device to receive such other information as may be required
to perform the task associated with the event, for example to move the cursor
downward on the display.
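
The table search described above may be sketched as a simple linear scan. The
sketch assumes that the TOP and BOTTOM columns of Table 1 bound the first
touch coordinate and the LEFT and RIGHT columns bound the second, which is the
reading that reproduces the worked example of (1.5, 0.5) falling within entry
B. The names LAYOUT and find_region are hypothetical, and the fifth row, which
carries no region label in Table 1, is given the placeholder label
"(unnamed)".

```python
# Rows of Table 1 as (region, top, bottom, left, right).
# "(unnamed)" is a placeholder for the fifth row, which has no label in Table 1.
LAYOUT = [
    ("U", -2, -1, -1, 1),
    ("B", 1, 2, -1, 1),
    ("R", -1, 1, 1, 2),
    ("L", -1, 1, -2, -1),
    ("(unnamed)", -1, 1, -1, 1),
]


def find_region(x, z, layout=LAYOUT):
    """Return the label of the first region containing (x, z), else None.

    Assumed reading: top < x < bottom and left < z < right, with strict
    inequalities as in the worked example.
    """
    for region, top, bottom, left, right in layout:
        if top < x < bottom and left < z < right:
            return region
    return None
```

A linear scan suffices for a handful of regions; a full keyboard layout could
use the same test over a longer table.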

Fig. 5B depicts an embodiment of system 10 similar to that shown in Fig. 1A.
In Fig. 5B the virtual input device 50 is a computer keyboard and the companion
device 80, 90 is a mobile transceiver, a cellular telephone for example. It
is to be understood that system 10 could in fact be implemented within device
80, 90. As such, OS1 might emit fan-beam 30 from a lower portion of device
80, 90, and OS2 might be disposed in an upper portion of the same device.
The virtual input device 50 could, if desired, be projected optically from device
80, 90. Alternatively virtual input device 50 might be printed on a foldable
substrate, e.g., plastic, paper, etc. that can be retained within device 80, 90,
then removed and unfolded or unrolled and placed on a flat work surface in
front of device 80, 90. The location of virtual input device 50 in front of device
80, 90 would be such that OS1 can emit a fan-beam 30 encompassing the
virtual input device, and OS2 can detect intersection 110 of an object, e.g., a
user's finger or cursor, etc., with a location in the fan-beam overlying any
region of interest in virtual input device 50.



In Fig. 5B, OS2 will not detect reflected optical energy until object 120R
intercepts fan-beam 30, whereupon some optical energy emitted by OS1 will
be reflected (130) and will be detected by OS2. Relative to the (x,z) coordinate
system shown in Fig. 1A, the point of interception 110 is approximately
location (13,5). Referring to Fig. 4, it is understood that keyboard layout table
250 will have at least one entry for each virtual key, e.g., "1", "2", ... "Q", "W",
... "SHIFT" defined on virtual input device 50. An entry search process similar
to that described with respect to Fig. 5A is carried out, preferably by unit 70,
and the relevant virtual key that underlies touch point 110 can be identified.
In Fig. 5B, the relevant key is "I", which letter is shown on display 150 as
part of e-mail message text 140 being input into cellular telephone 80, 90 by a
portion of the user's hand 120R (or by a stylus). The ability to rapidly touch-type
messages into cellular telephone 80, 90 using virtual keyboard 50, as
contrasted with laboriously inputting messages using the cellular telephone
keypad, will be appreciated.


In Fig. 5C, an embodiment of system 10 is shown in which the workspace 40
is a vertical wall, perhaps in a store or mall, and virtual input device 50 is also
vertically disposed. In this embodiment, virtual input device 50 is shown with
several icons and/or words 320 that when touched by a user's hand 120, e.g.,
at touch point 110, will cause an appropriate text and/or graphic image 140 to
appear on display 150 in companion device 80, 90. In the example shown,
icons 320 may represent locations or departments in a store, and display 150
will interactively provide further information in response to user touching of an
icon region. In a mall, the various icons may represent entire stores, or
departments or regions within a store, etc. The detection and localization of
touch points such as 110 is preferably carried out as has been described with
respect to the embodiments of Figs. 3A and 3B. Preferably processor unit 70
within system 10 executes software, also stored within or loadable into
processor unit 70, to determine what icon or text portion of virtual input device
50 has been touched, and what commands and/or data should be communicated
to host system 80, 90.
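
By way of illustration only, the icon determination just described might be sketched as a lookup of the touch point against a table of rectangular icon regions. The class name, the icon names, the region sizes and the command strings below are all hypothetical and are not taken from the figures; the sketch merely assumes that each icon occupies an axis-aligned rectangle in the (x,z) plane.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class IconRegion:
    """Hypothetical rectangular icon area on the virtual input device (cm)."""
    name: str
    x_min: float
    x_max: float
    z_min: float
    z_max: float
    command: str  # what would be communicated to the host system


# Hypothetical layout; a real table would be stored in or loaded into unit 70.
LAYOUT: List[IconRegion] = [
    IconRegion("shoes", 0.0, 10.0, 0.0, 10.0, "SHOW_SHOES"),
    IconRegion("toys", 10.0, 20.0, 0.0, 10.0, "SHOW_TOYS"),
]


def command_for_touch(x: float, z: float,
                      layout: List[IconRegion] = LAYOUT) -> Optional[str]:
    """Return the command of the icon under touch point (x, z), if any."""
    for icon in layout:
        if icon.x_min <= x < icon.x_max and icon.z_min <= z < icon.z_max:
            return icon.command
    return None  # the touch did not land on any icon
```

Because the layout is passed in as data, changing what the icons mean would amount to supplying a different table rather than changing the lookup itself.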



In the embodiment of Fig. 5C, if the virtual input device 50 is apt to be
changed frequently, e.g., perhaps it is a menu in a restaurant where display
150 can provide detailed information such as calories, contents of sauces,
etc., device 50 may be back projected from within wall 40. Understandably, if
the layout and location of the various icons 320 change, mapping information
stored within unit 70 in system 10 will also be changed. The ability to rapidly
change the nature and content of the virtual input device without necessarily
being locked into having icons of a fixed size in a fixed location can be very
useful. If desired, some icons may indeed be fixed in size and location on
device 50, and their touching by a user may be used to select a re-mapping
of what is shown on input device 50, and what is mapped by software within
unit 70. It is understood that in addition to simply displaying information,
which may include advertisements, companion device 80, 90 may be used to
issue promotional coupons 330 for users.

Turning now to Fig. 6, the manner in which system 10 registers a touch event
and localizes its position depends upon whether
system 10 is a structured-light system or a passive light system. As noted
earlier, in a structured-light system OS1 may be a line generating laser
system, and in a passive light system, OS1 may be a digital camera. Each
system defines a plane 30 that when intercepted by an object such as 120R
will define a touch event whose (x,z) coordinates are then to be determined.
Once the (x,z) coordinates of the virtual touch are determined, the present
invention can decide what input or command was intended by the person
using the system. Such input or command can be passed to a companion
device, which device may in fact also house the present invention.

If system 10 is a passive light system, a touch event is registered when the
outline of a fingertip appears in a selected frame row of OS1, a digital camera.
The (x,z) plane 30 location of the touch is determined by the pixel
position of the corresponding object tip (e.g., 120R) in OS2, when a touch is
detected in OS1. As shown in Fig. 6, the range or distance from camera OS1
to the touch point is an affine function of the number of pixels from the "near"
end of the pixel frame.
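
The affine relation between pixel count and range can be illustrated with a minimal sketch. It assumes, purely for illustration, that the two affine coefficients are obtained from two calibration touches at known distances; the function names and the calibration values (row 0 at 20 cm, row 400 at 40 cm) are hypothetical and do not come from the figures.

```python
def fit_affine(row_a, range_a, row_b, range_b):
    """Return (slope, offset) such that range = slope * row + offset.

    Each (row, range) pair is a hypothetical calibration measurement:
    a pixel count from the "near" end of the frame and a known distance.
    """
    if row_a == row_b:
        raise ValueError("calibration rows must differ")
    slope = (range_b - range_a) / (row_b - row_a)
    offset = range_a - slope * row_a
    return slope, offset


def range_from_row(row, slope, offset):
    """Evaluate the affine function for a pixel count from the near end."""
    return slope * row + offset


# Hypothetical calibration: row 0 lies 20 cm away, row 400 lies 40 cm away.
slope, offset = fit_affine(0, 20.0, 400, 40.0)
midway_cm = range_from_row(200, slope, offset)  # halfway between the two
```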


As noted, in a structured-light embodiment, OS1 will typically be a laser line
generator, and OS2 will be a camera primarily sensitive to the wavelength of the
light energy emitted by OS1. As noted, this can be achieved by installing a
narrowband light filter on OS2 such that only the wavelength corresponding to
that emitted by OS1 will pass. Alternatively, OS2 can be understood to
include a shutter that opens and closes in synchronism with the pulse output of
OS1, e.g., OS2 can see optical energy only at times that OS1 emits optical
energy. In either embodiment of a structured-light system, OS2 preferably
will only detect objects that intercept plane 30 and thus reflect energy emitted
by OS1.

In the above case, touch sense detection and range calculation are carried
out by system 10. Thus, a touch event is registered when the outline of an
object, e.g., fingertip 120R, appears within the viewing range of OS2. As in
the above example, range distance may be calculated as an affine function of
the number of pixels from the "near" end of the pixel frame.

A further example of analytical steps carried out in Fig. 4 by the present
invention will now be given. Assume that the virtual input device is a keyboard
50, such as depicted in Fig. 1A, and that system 10 is expected to
output information comprising at least the scan code corresponding to the
virtual key that the user has "touched" on virtual keyboard 50. In Fig. 1A and
Fig. 2A, assume that the upper portion (e.g., the row with virtual keys "ESC",
"F1", "F2", etc.) is a distance of about 20 cm from optical system OS1 20.
Assume that camera OS2 60 is mounted on a PDA or other device 80 that is
about 10 cm tall, and is placed at a known angle of 120° relative to the
plane 30. Assume too that camera OS2 60 has a lens with a focal length of
about 4 mm, and a camera sensor arrayed with 480 rows and 640 columns.

The coordinates of the upper left corner of virtual keyboard 50 are set by
convention to be x=0 and z=0, e.g., (0,0). The homography H that maps
points in the image to points on the virtual device depends on the tilt of
camera OS2 60. An exemplary homography matrix for the configuration
above is as follows:



\[
H = \begin{bmatrix}
0.133 & -0.061 & 32.9 \\
-0.194 & 0.0 & 15.1 \\
0.0 & 0.0 & 1.0
\end{bmatrix}
\]



The above matrix preferably need be determined only once during a calibration
procedure, described elsewhere herein.

Referring now to Fig. 1A and Fig. 7, assume that user 120L touches the
region of virtual keyboard 50 corresponding to the letter "T", which letter "T"
may be printed on a substrate to guide the user's fingers or may be part of an
image of the virtual input device perhaps projected by system 10. Using the
system of coordinates defined above, key "T" may be said to lie between
horizontal coordinates $x_{\min} = 10.5$ and $x_{\max} = 12.4$ cm, and between vertical
coordinates $y_{\min} = 1.9$ and $y_{\max} = 3.8$ cm, as shown in Fig. 7.

Referring now to Fig. 6, before the user's finger 120L (or stylus) intersects the
plane of sensor OS1 20, the latter detects no light, and sees an image made
of black pixels, as shown in vignette 340 at the figure bottom.
However, as soon as the user-object intersects optical plane 30, the intersection
event or interface becomes visible to OS1 20. OS1 20 now generates an
image similar to the one depicted in vignette 350 at the bottom of Fig. 6.
When the downward moving tip 110 of user-object (e.g., finger 120L) reaches
surface 40, more of the finger becomes visible. The finger contour may now
be determined, e.g., by unit 70 using edge detection. Such determination is
depicted at the bottom of Fig. 6 as "TOUCH" event vignette 360. Touch
detection module 220 in Fig. 4 then determines that the user-object has
touched surface 40, and informs tip detection module 230 of this occurrence.

As seen in Fig. 1A, the virtual "T" key is found in the second row of virtual
keyboard 50, and is therefore relatively close to sensor OS1 20. In Fig. 6, this
situation corresponds to the fingertip in position 110'. As further shown in Fig.
6, the projection of the bottom of the fingertip position 110' onto the sensor of
optical system OS2 60 is relatively close to the top of the image. The edge of
the fingertip image thus produced is similar to that shown in vignette 370 at the
top of Fig. 6. In vignette 370, the two gray squares shown represent the
bottom edge pixels of the fingertip.

Had the user instead struck the spacebar or some other key closer to the
bottom of virtual keyboard 50, that is, further away from the sensor OS1 20,
the situation depicted by fingertip position 110 in Fig. 6 would have arisen.
Such a relatively far location on the virtual keyboard is mapped to a pixel
closer to the bottom of the image, and an edge image similar to that sketched
in vignette 380 at the top of Fig. 6 would have instead arisen. Intermediate
virtual key contact situations would produce edge images that are more
similar to that depicted as vignette 390 at the top of Fig. 6.
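
As a rough illustration of how the bottom edge pixels of a fingertip image might be located, the following sketch scans a binary frame from its last row upward and reports the center of the lowest lit row. It is a simplification under stated assumptions and is not the edge detection referred to above: the frame is assumed to be already thresholded to 0 and 1, row indices are assumed to increase toward the bottom of the image, and the function name is hypothetical.

```python
def bottom_center(frame):
    """Return (row, col) at the center of the lowest lit row of a binary
    frame, or None if the frame is entirely black (no intersection)."""
    for row in range(len(frame) - 1, -1, -1):  # scan from the bottom row up
        lit_cols = [col for col, value in enumerate(frame[row]) if value]
        if lit_cols:
            # Midpoint between the leftmost and rightmost lit pixels.
            return row, (lit_cols[0] + lit_cols[-1]) // 2
    return None


# Tiny hypothetical frame: the two lit pixels in row 2 play the role of the
# two bottom edge pixels of the fingertip.
frame = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
tip = bottom_center(frame)
```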

In the above example in which virtual key "T" is pressed, tip detection module
230 in Fig. 4 runs an edge detection algorithm, and thereby finds the bottom
center of the "blob" representing the generalized region of contact to be at
image row 65 and column 492. The homogeneous image coordinate vector p,
given below, is therefore formed:



\[
p = \begin{bmatrix} 65 \\ 492 \\ 1 \end{bmatrix}
\]



The homogeneous image coordinate vector p is then multiplied by the
homography matrix H to yield the coordinates P of the user fingertip in the
frame of reference of the virtual keyboard:

\[
P = Hp = \begin{bmatrix}
0.133 \times 65 - 0.061 \times 492 + 32.9 \times 1 \\
-0.194 \times 65 + 0.0 \times 492 + 15.1 \times 1 \\
0.0 \times 65 + 0.0 \times 492 + 1.0 \times 1
\end{bmatrix}
= \begin{bmatrix} 11.53 \\ 2.49 \\ 1.00 \end{bmatrix}
\]

The user-object or finger 120L is thus determined to have touched virtual
keyboard 50 at a location point having coordinates x = 11.53 and z = 2.49 cm.
Key identification module 260 in Fig. 4 searches keyboard layout 250 for a
key such that $x_{\min} < 11.53 < x_{\max}$ and $y_{\min} < 2.49 < y_{\max}$.



These conditions are satisfied for the virtual "T" key because
10.5<11.53<12.4, and 1.9<2.49<3.8. Referring to Fig. 4, key identification
module 260 therefore determines that a user-object is touching virtual key "T"
on virtual keyboard 50, and informs translation module 270 of this occurrence.
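
The arithmetic of this example can be checked with a short sketch that applies the exemplary homography to the image point (row 65, column 492) and then searches a layout table. The function and variable names are hypothetical, and the layout table here holds only the "T" entry, whose bounds are the ones given in the text; a full keyboard layout 250 would hold an entry for every virtual key.

```python
# Exemplary homography from the text, stored row by row.
H = [
    [0.133, -0.061, 32.9],
    [-0.194, 0.0, 15.1],
    [0.0, 0.0, 1.0],
]


def apply_homography(h, row, col):
    """Map image point (row, col) to keyboard coordinates (x, z) in cm."""
    p = (row, col, 1.0)  # homogeneous image coordinate vector
    x, z, w = (sum(h[i][j] * p[j] for j in range(3)) for i in range(3))
    return x / w, z / w  # normalize by the homogeneous coordinate


# Key bounds in cm as (min, max) pairs; only "T" is taken from the text.
KEYBOARD_LAYOUT = {"T": ((10.5, 12.4), (1.9, 3.8))}


def find_key(x, z, layout=KEYBOARD_LAYOUT):
    """Return the key whose bounds strictly contain (x, z), or None."""
    for key, ((x_lo, x_hi), (z_lo, z_hi)) in layout.items():
        if x_lo < x < x_hi and z_lo < z < z_hi:
            return key
    return None


x, z = apply_homography(H, 65, 492)
key = find_key(x, z)
```

With the values above, x evaluates to about 11.53 and z to about 2.49, and the search returns the "T" entry, in agreement with the worked example.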

The occurrence need not necessarily be a keystroke. For example, the user-object
or finger may have earlier contacted the "T" key and may have remained
in touch contact with the key thereafter. In such case, no keystroke
event should be communicated to application 280 running on the companion
device 80 or 90.

Key translation module 270 preferably stores the up-state or down-state of
each key internally. This module determines at every frame whether any key
has changed state. In the above example, if the key "T" is found to be in the
down-state in the current frame but was in the up-state in the previous frame,
translation module 270 sends a KEYCODE message to application 280. The
KEYCODE message will include a 'KEY DOWN' event identifier, along with a
'KEY ID' tag that identifies the "T" key, and thereby informs application 280
that the "T" key has just been "pressed" by the user-object. If the "T" key were
found to have been also in the down-state during previous frames, the
KEYCODE would include a 'KEY HELD' event identifier, together with the
'KEY ID' associated with the "T" key. Sending the 'KEY HELD' event at each
frame (excepting the first frame) in which the key is in the down-state frees
application 280 from having to maintain any state about the keys. Once the
"T" key is found to be in the up-state in the current frame but was in the down-state
in previous frames, translation module 270 sends a KEYCODE with a
'KEY UP' event identifier, again with a 'KEY ID' tag identifying the "T" key,
informing application 280 that the "T" key was just "released" by the user-object.
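
The frame-by-frame behavior just described can be sketched as a comparison between the set of keys that were down in the previous frame and the set that are down in the current frame. The function name and the representation of a KEYCODE message as an (event, key) pair are assumptions made for illustration only.

```python
def translate(previous_down, current_down):
    """Compare the keys down in two consecutive frames and return the
    KEYCODE messages, as (event identifier, key id) pairs, for this frame."""
    messages = []
    for key in sorted(current_down - previous_down):
        messages.append(("KEY DOWN", key))  # up before, down now
    for key in sorted(current_down & previous_down):
        messages.append(("KEY HELD", key))  # down before, still down
    for key in sorted(previous_down - current_down):
        messages.append(("KEY UP", key))    # down before, up now
    return messages


# Three hypothetical frames: "T" is pressed, held for one frame, released.
frames = [{"T"}, {"T"}, set()]
previous = set()
log = []
for current in frames:
    log.extend(translate(previous, current))
    previous = current  # the only state kept is the previous frame's keys
```

In this sketch the state lives entirely on the translation side, so the receiving application can act on each message without tracking which keys are down.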

From the foregoing, it will be appreciated that it suffices that frame images
comprise only the tips of the user-object, e.g., fingertips. The various embodiments
of the present invention use less than full three-dimensional image
information acquired from within a relatively shallow volume defined slightly
above a virtual input or virtual transfer device. A system implementing these
embodiments can be relatively inexpensively fabricated and operated from a
self-contained battery source. Indeed, the system could be constructed
within common devices such as PDAs, cellular telephones, etc. to hasten the
input or transfer of information from a user. As described, undesired effects
from ambient light may be reduced by selection of wavelengths in active light
embodiments, by synchronization of camera(s) and light sources, and by signal
processing techniques that acquire and subtract-out images representing
background noise.
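
The subtraction technique mentioned last might be sketched as follows, assuming that a background frame has been acquired while no user-object is present and that frames are given as rows of integer pixel intensities. The function name, the threshold parameter and the pixel values are hypothetical.

```python
def subtract_background(frame, background, threshold=0):
    """Subtract a background frame pixel by pixel.

    Differences at or below `threshold` are treated as ambient noise and
    set to 0; larger differences are kept as signal.
    """
    result = []
    for frame_row, background_row in zip(frame, background):
        row = []
        for pixel, ambient in zip(frame_row, background_row):
            difference = pixel - ambient
            row.append(difference if difference > threshold else 0)
        result.append(row)
    return result


# Hypothetical 2x2 example: one pixel brightens strongly when touched.
background = [[10, 10], [10, 10]]
touched = [[12, 10], [90, 10]]
signal = subtract_background(touched, background, threshold=5)
```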

Modifications and variations may be made to the disclosed embodiments
without departing from the subject and spirit of the invention as defined by the
following claims.

CLAIMS

1. A method to obtain information from an interaction of a user-object
with a virtual transfer device, the method comprising the following steps:

(a) defining a plane substantially parallel to and spaced-above a
presumed location of said virtual transfer device;

(b) sensing when a user-object penetrates said plane to interact with
said virtual transfer device; and

(c) determining relative position of a portion of said user-object on said
plane.

2. The method of claim 1, further including:

(d) transferring to a companion device information commensurate with
position of said user-object penetration relative to said virtual transfer device;
wherein user-object interaction with said virtual transfer device affects
operation of said companion device.

3. The method of claim 1, wherein step (a) includes generating a
plane of optical energy, and wherein step (b) includes detecting a reflected
portion of said optical energy when said user-object penetrates said plane.

4. The method of claim 1, wherein step (a) includes providing a
camera that defines said plane; and step (b) includes observing interaction of
said user-object with said plane.

5. The method of claim 1, wherein at least one of step (b) and step (c)
is carried out using triangulation analysis.

6. The method of claim 2, wherein said companion device includes at
least one of (i) a PDA, (ii) a portable communication device, (iii) an electronic
device, (iv) an electronic game device, and (v) a musical instrument, and said
virtual transfer device is at least one of (I) a virtual keyboard, (II) a virtual
mouse, (III) a virtual trackball, (IV) a virtual pen, (V) a virtual trackpad, and
(VI) a user-interface selector.

7. The method of claim 1, wherein said virtual transfer device is
mapped to a work surface selected from at least one of (i) a table top, (ii) a
desk top, (iii) a wall, (iv) a point-of-sale appliance, (v) a point-of-service
appliance, (vi) a kiosk, (vii) a surface in a vehicle, (viii) a projected display, (ix)
a physical display, (x) a CRT, and (xi) an LCD.

8. The method of claim 1, wherein at least one of step (a) and step (b)
includes providing a camera having a lens and an image plane, and further
including improving at least one of resolution and depth of field of said
camera by tilting at least one of said lens and said image plane.

9. The method of claim 1, wherein:

step (a) includes defining said plane using an optical source; and

step (b) includes providing a camera to sense penetration of said
plane.

10. The method of claim 9, further including:

synchronizing operation of said optical source and said camera;

wherein effects of ambient light upon accuracy of information obtained
at at least one of step (b) and step (c) are reduced.

11. The method of claim 9, wherein said optical source emits optical
energy bearing a signature used to reject ambient light.

12. The method of claim 1, wherein:

step (a) includes defining said plane with a first camera;
step (b) includes providing a second camera to sense penetration of
said plane; and further including:

directing a source of optical energy generally toward said virtual
transfer device; and

synchronizing operation of said source of optical energy and at least
one of said first camera and said second camera;
wherein effects of ambient light upon accuracy of information obtained
at at least one of step (b) and step (c) are reduced.

13. The method of claim 1, wherein:

step (b) includes acquiring information generated by ambient light by
sensing when said user-object is distant from said plane; and

at least one of step (b) and step (c) includes subtracting said information
from information acquired when said user-object interacts with said
transfer device;

wherein effects of ambient light are reduced.

14. A system enabling a user-manipulated user-object used with a
virtual transfer device to transfer information to a companion device, the
system comprising:

a central processor unit including memory storing at least one software
routine;

a first optical system defining a plane substantially parallel-to and
spaced-above a presumed location of said virtual transfer device;

a second optical system having a relevant field of view encompassing
at least portions of said plane and responsive to user-object penetration of
said plane to interact with said virtual transfer device;

means for determining relative position of a portion of said user-object
on said plane;

wherein said system transfers information to said companion device
enabling user-object with said virtual transfer device to affect operation of
said companion device.

15. The system of claim 14, wherein said means for determining
includes determining said relative position using triangulation analysis.

16. The system of claim 14, wherein said means for determining
includes said processor unit executing said routine to determine said relative
position.

17. The system of claim 14, wherein:

said first optical system includes means for generating a plane of
optical energy; and

said second optical system includes a camera sensor that detects a
reflected portion of said optical energy when said user-object penetrates said
plane.

18. The system of claim 14, wherein:

said first optical system includes at least one of (i) a laser to generate
said plane, and (ii) an LED to generate said plane; and

said second optical system includes a camera sensor that detects a
reflected portion of said optical energy when said user-object penetrates said
plane.

19. The system of claim 14, further including means for enhancing
responsiveness of said second optical system to said user-object penetration
while decreasing said responsiveness to ambient light.

20. The system of claim 19, wherein said means for enhancing
includes at least one of (a) providing a signature associated with generation
of said plane, (b) selecting a common wavelength for energy within said plane
defined by said first optical system and for responsiveness of said second
optical system, and (c) synchronizing operation of said first optical system
and operation of said second optical system.

21. The system of claim 14, wherein said first optical system
includes a first camera sensor that defines said plane.

22. The system of claim 14, wherein:

said first optical system includes a first camera sensor that defines
said plane;

said second optical system includes a second camera to sense said
penetration;

and further including:

a source of optical energy directed generally toward said virtual
transfer device; and

means for synchronizing operation of at least two of said first optical
system, said second optical system, and said source of optical energy;

wherein effects of ambient light upon accuracy of information obtained
with said system are reduced.

23. The system of claim 14, wherein:

said first optical system includes a generator of optical energy of a
desired wavelength; and

said second optical system is sensitive substantially only to optical
energy of said desired wavelength.

24. The system of claim 14, wherein said companion device includes
at least one of (i) a PDA, (ii) a portable communication device, (iii) an
electronic device, (iv) an electronic game device, and (v) a musical instrument,
and said virtual transfer device is at least one of (I) a virtual keyboard,
(II) a virtual mouse, (III) a virtual trackball, (IV) a virtual pen, (V) a virtual
trackpad, and (VI) a user-interface selector.

25. The system of claim 14, wherein said virtual transfer device is
mapped to a work surface selected from at least one of (i) a table top, (ii) a
desk top, (iii) a wall, (iv) a point-of-sale appliance, (v) a point-of-service
appliance, (vi) a kiosk, (vii) a surface in a vehicle, (viii) a projected display, (ix)
a physical display, (x) a CRT, and (xi) an LCD.

26. The system of claim 14, wherein at least one of said first operating
system and said second operating system is a camera sensor having a lens
and an image plane;

wherein at least one of said lens and said image plane is tilted to
enhance at least one of resolution and depth of field.

27. The system of claim 14, further including means for enhancing
distinguishment of said user-object from a background object.

FIG. 3B